
ci: bundle all RDNA3/3.5/4 archs #49

Open
Geramy wants to merge 1 commit into main from geramy/enable_more_archs

@Geramy Geramy commented Jun 28, 2026

Member

Bundle all RDNA3 / RDNA3.5 / RDNA4 archs in the CI release

The CI release was built for gfx1151 only (ROCM_ARCH), so other AMD GPUs (e.g. gfx1152 Radeon 860M, gfx1201 R9700) shipped with no native device code and fell back to broken/garbage paths.

This sets HIP_BUILD_ARCHS for CMAKE_HIP_ARCHITECTURES to the full set:

  • RDNA3: gfx1100, gfx1101, gfx1102, gfx1103
  • RDNA3.5: gfx1150, gfx1151, gfx1152
  • RDNA4: gfx1200, gfx1201
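
As a rough illustration of how the list above could be turned into a CMake argument, the sketch below joins the architectures into the semicolon-separated form that CMAKE_HIP_ARCHITECTURES accepts. The helper names (`hip_build_archs`, `cmake_arch_flag`) and the use of Python are assumptions for illustration only; the actual workflow in this PR may assemble HIP_BUILD_ARCHS differently, for example directly in the CI configuration.

```python
# Hypothetical helper, not part of this PR: it only illustrates the arch list.
RDNA_ARCHS = {
    "RDNA3": ["gfx1100", "gfx1101", "gfx1102", "gfx1103"],
    "RDNA3.5": ["gfx1150", "gfx1151", "gfx1152"],
    "RDNA4": ["gfx1200", "gfx1201"],
}


def hip_build_archs(families=RDNA_ARCHS):
    """Flatten the per-family lists into one semicolon-separated CMake list."""
    archs = [arch for group in families.values() for arch in group]
    return ";".join(archs)


def cmake_arch_flag(families=RDNA_ARCHS):
    """Build the -D argument that would carry the list to CMake."""
    return f"-DCMAKE_HIP_ARCHITECTURES={hip_build_archs(families)}"


if __name__ == "__main__":
    print(cmake_arch_flag())
```

If a value like this is passed on a shell command line, it needs quoting, since an unquoted `;` would end the command.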

One release binary bundles them all (the HIP fat binary carries code for every listed arch, and the matching ISA is selected at runtime). ROCM_ARCH stays gfx1151 for the runner's ROCm package install; the other archs are cross-compiled by the base toolchain.
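
To make the coverage argument concrete, here is a minimal sketch of the selection rule as a simple membership check. It is a simplified model, not the HIP runtime's actual implementation: the function names are hypothetical, and the handling of `:feature` suffixes on the reported arch name is an assumption about how some ROCm devices report their target.

```python
# Simplified model of "does this build contain native code for this GPU?".
# All names here are hypothetical; the real lookup happens inside the HIP runtime.
OLD_BUILD = ["gfx1151"]
NEW_BUILD = [
    "gfx1100", "gfx1101", "gfx1102", "gfx1103",  # RDNA3
    "gfx1150", "gfx1151", "gfx1152",             # RDNA3.5
    "gfx1200", "gfx1201",                        # RDNA4
]


def base_arch(reported_name):
    """Drop any ':feature' suffix (assumed format) and keep the gfx target."""
    return reported_name.split(":", 1)[0]


def has_native_code(device_arch, bundled_archs):
    """True if the bundle includes code objects for the device's target."""
    return base_arch(device_arch) in set(bundled_archs)


if __name__ == "__main__":
    for gpu in ("gfx1151", "gfx1152", "gfx1201"):
        print(gpu, has_native_code(gpu, OLD_BUILD), has_native_code(gpu, NEW_BUILD))
```

Under this model, gfx1152 and gfx1201 are not covered by the gfx1151-only build but are covered by the full set, which matches the problem described above.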

Pairs with the matching NripeshN/mlx rocm-support changes (native-WMMA allowlist incl. gfx1152, WMMA prefill default-on, legacy graph-build path disabled).

CI release built gfx1151-only, so other AMD GPUs had no native device code.
Set HIP_BUILD_ARCHS to the full RDNA3 (gfx1100-1103), RDNA3.5 (gfx1150-1152),
and RDNA4 (gfx1200-1201) set for CMAKE_HIP_ARCHITECTURES, so one release binary
bundles them all (HIP selects the matching ISA at runtime). ROCM_ARCH stays
gfx1151 for the runner's package install; the rest are cross-compiled by the
base toolchain.